Multiple topic identification in telephone conversations
Authors
Abstract
This paper addresses the automatic analysis of conversations between a customer and an agent in the call centre of a customer care service. The goal of the analysis is to hypothesize themes for the problems and complaints discussed in a conversation. Themes are defined by the topics of the application documentation. A conversation may contain mentions that are irrelevant to the application, as well as multiple themes whose mentions are interleaved across portions of the conversation that cannot be clearly delimited. Two methods are proposed for multiple theme hypothesization. The first is based on a cosine similarity measure that uses a bag of features extracted from the entire conversation. The second introduces the concept of thematic density distributed around specific word positions in a conversation. In addition to automatically selected words, word bigrams with possible gaps between successive words are also considered and selected. Experimental results show that the proposed methods outperform support vector machines on the same data. Furthermore, the theme skeleton of a conversation, from which thematic densities are derived, will make it possible to extract components of an automatic conversation report that can be used to improve service performance.
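The first method described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature selection procedure, the feature weights, or the decision rule, so the sketch assumes raw counts over all words and gapped bigrams, and a fixed similarity threshold for accepting several themes at once. The function names, the `max_gap` parameter, and the `threshold` value are all hypothetical.

```python
from collections import Counter
from math import sqrt


def gapped_bigrams(tokens, max_gap=2):
    """Yield word pairs with at most `max_gap` words skipped between them."""
    for i, left in enumerate(tokens):
        # j runs over the next (max_gap + 1) positions, clipped at the end.
        for j in range(i + 1, min(i + 2 + max_gap, len(tokens))):
            yield f"{left}_{tokens[j]}"


def features(text, max_gap=2):
    """Bag of features: word counts plus gapped-bigram counts (assumed weighting)."""
    tokens = text.lower().split()
    return Counter(tokens) + Counter(gapped_bigrams(tokens, max_gap))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hypothesize_themes(conversation, theme_profiles, threshold=0.2):
    """Return every theme whose profile is similar enough, best match first.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    conv = features(conversation)
    scores = {theme: cosine(conv, profile) for theme, profile in theme_profiles.items()}
    accepted = [theme for theme, score in scores.items() if score >= threshold]
    return sorted(accepted, key=lambda theme: -scores[theme])
```

In this sketch a theme profile is simply the feature bag of some example text for that theme, for instance `{"lost_item": features("lost bag on the bus")}`. Because every theme above the threshold is returned, a single conversation can receive several theme hypotheses.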
Similar resources
Techniques for rapid and robust topic identification of conversational telephone speech
In this paper, we investigate the impact of automatic speech recognition (ASR) errors on the accuracy of topic identification in conversational telephone speech. We present a modified TF-IDF feature weighting calculation that provides significant robustness under various recognition error conditions. For our experiments we take conversations from the Fisher corpus to produce 1-best and lattice ...
Confidence-Based Techniques for Rapid and Robust Topic Identification of Conversational Telephone Speech
We investigate the impact of automatic speech recognition errors on the accuracy of topic identification in conversational telephone speech. We present a modified TF-IDF feature weighting calculation that provides significant robustness under various recognition error conditions. For our experiments we take conversations from the Fisher corpus to produce 1-best and lattice outputs using one reco...
Theme identification in human-human conversations with features from specific speaker type hidden spaces
This paper describes research on topic identification in real-world customer service telephone conversations between an agent and a customer. Separate hidden spaces are considered for agents, customers and the combination of them. The purpose is to separate semantic constituents from the speaker types and their possible relations. Probabilities of hidden topic features are then used by separ...
Subspace Gaussian mixture models for dialogues classification
The main objective of this paper is to identify themes from dialogues of telephone conversations in a real-life customer care service. In order to capture significant semantic content in spite of high expression variability, features are extracted in a large number of hidden spaces constructed with a Latent Dirichlet Allocation (LDA) approach. Multiple views of a spoken document can then be repr...
Detecting Health Related Discussions in Everyday Telephone Conversations for Studying Medical Events in the Lives of Older Adults
We apply semi-supervised topic modeling techniques to detect health-related discussions in everyday telephone conversations, which has applications in large-scale epidemiological studies and for clinical interventions for older adults. The privacy requirements associated with utilizing everyday telephone conversations preclude manual annotations; hence, we explore semi-supervised methods in thi...